Abstract. We introduce a random matrix model where the entries are dependent across both rows and columns. More precisely, we investigate matrices of the form $X = (X_{(i-1)n+t})_{it} \in \mathbb{R}^{p\times n}$ derived from a linear process $X_t = \sum_j c_j Z_{t-j}$, where the $\{Z_t\}$ are independent random variables with bounded fourth moments. We show that, when both $p$ and $n$ tend to infinity such that the ratio $p/n$ converges to a finite positive limit $y$, the empirical spectral distribution of $p^{-1}XX^T$ converges almost surely to a deterministic measure. This limiting measure, which depends on $y$ and the spectral density of the linear process $X_t$, is characterized by an integral equation for its Stieltjes transform. The matrix $p^{-1}XX^T$ can be interpreted as an approximation to the sample covariance matrix of a high-dimensional process whose components are independent copies of $X_t$.



1. Introduction 

Random matrix theory studies the properties of large random matrices $A = (A_{ij})_{ij} \in K^{p\times n}$, for some field $K$. In this article, the entries $A_{ij}$ are real random variables unless otherwise specified. Commonly, the focus is on asymptotic properties of such matrices as their dimensions tend to infinity. One particularly interesting object of study is the asymptotic distribution of their singular values. Since the squared singular values of $A$ are the eigenvalues of $AA^T$, this is often done by investigating the eigenvalues of $AA^T$, which is called a sample covariance matrix. The spectral characteristics of a $p\times p$ matrix $S$ are conveniently studied via its empirical spectral distribution, which is defined as $F^S = p^{-1}\sum_{i=1}^p \delta_{\lambda_i}$; here, $\{\lambda_1,\dots,\lambda_p\}$ are the eigenvalues of $S$, and $\delta_x$ denotes the Dirac measure located at $x$. For some set $B\subset\mathbb{R}$, the figure $pF^S(B)$ is the number of eigenvalues of $S$ that lie in $B$. The measure $F^S$ is considered a random element of the space of probability distributions equipped with the weak topology, and we are interested in its limit as both $n$ and $p$ tend to infinity such that the ratio $p/n$ converges to a finite positive limit $y$.
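
As a minimal numerical sketch of this definition (the function name `esd_mass` and the Gaussian test matrix are illustrative assumptions, not part of the paper), the mass that $F^S$ assigns to an interval is simply the fraction of eigenvalues falling into it:

```python
import numpy as np

def esd_mass(S, a, b):
    """Return F^S([a, b]): the fraction of eigenvalues of the symmetric matrix S in [a, b]."""
    eigenvalues = np.linalg.eigvalsh(S)
    return np.mean((eigenvalues >= a) & (eigenvalues <= b))

# Illustrative example: sample covariance matrix built from i.i.d. Gaussian entries.
rng = np.random.default_rng(0)
p, n = 200, 400
A = rng.standard_normal((p, n))
S = A @ A.T / p

mass = esd_mass(S, 0.0, 2.0)   # F^S([0, 2])
count = p * mass               # number of eigenvalues of S in [0, 2]
```

Multiplying the mass by $p$ recovers an eigenvalue count, which is the relation between $F^S(B)$ and the number of eigenvalues in $B$ used in the text.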

The first result of this kind can be found in the remarkable paper of Marchenko and Pastur [14]. They showed that $F^{p^{-1}AA^T}$ converges to a non-random limiting spectral distribution $\hat F^{p^{-1}AA^T}$ if all $A_{ij}$ are independent, identically distributed, centred random variables with finite fourth moment. Interestingly, the Lebesgue density of $\hat F^{p^{-1}AA^T}$ is given by an explicit formula which only involves the ratio $y$ and the common variance of $A_{ij}$ and is therefore universal with respect to the distribution of the entries of $A$. Subsequently, the same result was obtained under the weaker moment condition that the entries $A_{ij}$ have finite variance. The requirement that the entries of $A$ be identically distributed has later been relaxed to a Lindeberg-type condition, cf. Eq. (3). For more details and a comprehensive treatment of random matrix theory we refer the reader to the text books Anderson et al. [1], Bai and Silverstein [5], and Mehta.

Recent research has focused on the question to what extent the assumption of independence of the entries of $A$ can be relaxed without compromising the validity of the Marchenko-Pastur law. In Aubrun it was shown that for random matrices $A$ whose rows are independent $\mathbb{R}^n$-valued random variables uniformly distributed on the unit ball of $\ell_q(\mathbb{R}^n)$, $q > 1$, the empirical spectral distribution $F^{p^{-1}AA^T}$ still converges to the same law as in the i.i.d. case. The Marchenko-Pastur law is, however, not stable with respect to more substantial deviations from the independence assumptions.

A very useful tool to characterize the limiting spectral distribution in random matrix models with dependent entries is the Stieltjes transform which, for some measure $\mu$, is defined as the map $s_\mu : \mathbb{C}^+ \to \mathbb{C}^+$, $s_\mu(z) = \int_{\mathbb{R}} (t-z)^{-1}\mu(dt)$. A particular, very successful random matrix model exhibiting dependence within the rows was investigated already by Marchenko and Pastur [14] and later in greater generality by Pan [17], Silverstein and Bai [20]: they modelled dependent data as a linear transformation of independent random variables, which led to the study of the eigenvalues of random matrices of the form $AHA^T$, where the entries of $A$ are independent, and $H$ is a positive semidefinite population covariance matrix whose spectral distribution converges to a non-random limit $\hat F^H$. They found that the Stieltjes transform of the limiting spectral distribution of $p^{-1}AHA^T$ can be characterized as the solution to an integral equation involving only $\hat F^H$ and the ratio $y = \lim p/n$. Another approach, suggested in Bai and Zhou [4] and further pursued in Pfaffel and Schlemm [18], is to model the rows of $A$ independently as stationary linear processes with independent innovations. This structure is interesting because the class of linear processes includes many practically relevant time series models, such as (fractionally integrated) ARMA processes, as special cases. The main result of Pfaffel and Schlemm [18] shows that for this model the limiting spectral distribution depends only on $y$ and the second-order properties of the underlying linear process.

All results for independent rows with dependent row entries also hold with minor modifications for the case where $A$ has independent columns with dependent column entries. This is due to the fact that the matrices $AA^T$ and $A^TA$ have the same non-zero eigenvalues.
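
The last fact is easy to check numerically. The following sketch (with an arbitrary Gaussian matrix chosen purely for illustration) compares the spectra of $AA^T$ and $A^TA$ for a $p\times n$ matrix with $p < n$:

```python
import numpy as np

rng = np.random.default_rng(1)
p, n = 3, 5
A = rng.standard_normal((p, n))

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
eig_small = np.linalg.eigvalsh(A @ A.T)   # p eigenvalues of the p x p matrix
eig_large = np.linalg.eigvalsh(A.T @ A)   # n eigenvalues of the n x n matrix

# The n - p smallest eigenvalues of A^T A vanish (up to rounding);
# the remaining p coincide with the eigenvalues of A A^T.
zero_part = eig_large[: n - p]
nonzero_part = eig_large[n - p:]
```

Only the multiplicity of the eigenvalue zero differs, which is why results for independent rows transfer to independent columns after adjusting the mass at the origin.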

In contrast, there are only very few results dealing with random matrix models where the entries are dependent across both rows and columns. The case where $A$ is given as the result of a two-dimensional linear filter applied to an array of independent complex Gaussian random variables is considered in Hachem et al. [10]. They use the fact that $A$ can be transformed to a random matrix with uncorrelated, non-identically distributed entries. Because of the assumption of Gaussianity the entries are in fact independent, and so an earlier result by the same authors [11] can be used to obtain the asymptotic distribution of the eigenvalues of $p^{-1}AA^*$. In the context of operator-valued free probability theory, Rashidi Far et al. [19] succeeded in characterizing the limiting spectral distribution of block Wishart matrices through a quadratic matrix equation for the corresponding operator-valued Stieltjes transform.

A parallel line of research focuses on the spectral statistics of large symmetric or Hermitian square matrices with dependent entries, thus extending Wigner's [23] seminal result for the i.i.d. case. Models studied in this context include random Toeplitz, Hankel and circulant matrices [8, 15, and references therein] as well as approaches allowing for a more general dependence structure [12].

In Pfaffel and Schlemm [18], the authors considered sample covariance matrices of high-dimensional stochastic processes, the components of which are modelled by independent infinite-order moving average processes with identical second-order characteristics. In practice, it is often not possible to observe all components of such a high-dimensional process, and the sample covariance matrix can then not be computed. To solve this problem when only one component is observed, it seems reasonable to partition one long observation record of that observed component of length $pn$ into $p$ segments of length $n$, and to treat the different segments as if they were records of the unobserved components. We show that this approach is valid and leads to the correct asymptotic eigenvalue distribution of the sample covariance matrix if the components of the underlying process are modelled as independent moving averages.

We are thus led to investigate a model of random matrices $X$ whose entries are dependent across both rows and columns, and which is not covered by the results mentioned above. The entries of the random matrix under consideration are defined in terms of a single linear stochastic process, see Section 2 for a precise definition. Without assuming Gaussianity we prove almost sure convergence of the empirical spectral distribution of $p^{-1}XX^T$ to a deterministic limiting measure and characterize the latter via an integral equation for its Stieltjes transform, which only depends on the asymptotic aspect ratio of the matrix and the second-order properties of the underlying linear process. Our result extends the class of random matrix models for which the limiting spectral distribution can be identified explicitly by a new, theoretically appealing model. It thus contributes to laying the ground for further research into more general random matrix models with dependent, non-identically distributed entries.

Outline. In Section 2 we give a precise definition of the random matrix model we investigate and state the main result about its limiting spectral distribution. The proof of the main theorem as well as some auxiliary results are presented in Section 3. Finally, in Section 4, we indicate how our result could be obtained in an alternative way from a similar random matrix model with independent rows.

Notation. We use $\mathbb{E}$ and $\operatorname{Var}$ to denote expected value and variance. Where convenient, we also write $\mu_{1,X}$ and $\mu_{2,X}$ for the first and second moment, respectively, of a random variable $X$. The symbol $\mathbf{1}_m$, $m$ a natural number, stands for the $m\times m$ identity matrix. For the trace of a matrix $S$ we write $\operatorname{tr} S$. For sequences of matrices $(S_n)_n$ we will suppress the dependence on $n$ where this does not cause ambiguity; the sequence of associated spectral distributions is denoted by $F^S$, and for their weak limit, provided it exists, we write $\hat F^S$. It will also be convenient to use asymptotic notation: for two sequences of real numbers $(a_n)_n$, $(b_n)_n$ we write $a_n = O(b_n)$ to indicate that there exists a constant $C$ which is independent of $n$, such that $a_n \le Cb_n$ for all $n$. We denote by $\mathbb{Z}$ the set of integers and by $\mathbb{N}$, $\mathbb{R}$, and $\mathbb{C}$ the sets of natural, real, and complex numbers, respectively. $\Im z$ stands for the imaginary part of a complex number $z$, and $\mathbb{C}^+$ is defined as $\{z\in\mathbb{C} : \Im z > 0\}$. The indicator of an expression $\mathcal{E}$ is denoted by $I_{\{\mathcal{E}\}}$ and defined to be one if $\mathcal{E}$ is true and zero otherwise.



2. A new random matrix model

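For a sequence $(Z_t)_{t\in\mathbb{Z}}$ of independent real random variables and real coefficients $(c_j)_{j\in\mathbb{N}\cup\{0\}}$, the linear process $(X_t)_{t\in\mathbb{Z}}$ and the $p\times n$ matrix $X$ are defined by $X_t = \sum_{j=0}^\infty c_jZ_{t-j}$ and

$$X = (X_{i,t})_{it} = (X_{(i-1)n+t})_{it} = \begin{pmatrix} X_1 & \cdots & X_n\\ X_{n+1} & \cdots & X_{2n}\\ \vdots & & \vdots\\ X_{(p-1)n+1} & \cdots & X_{pn}\end{pmatrix}\in\mathbb{R}^{p\times n}. \tag{1}$$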
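
The construction in Eq. (1) amounts to cutting one record of length $pn$ into $p$ consecutive rows. A minimal sketch, assuming only finitely many non-zero coefficients (the geometric coefficients and the helper names `linear_process` and `segment_matrix` are hypothetical choices for illustration):

```python
import numpy as np

def linear_process(coeffs, noise):
    """X_t = sum_j coeffs[j] * Z_{t-j}, for all t at which every required Z is available."""
    return np.convolve(noise, coeffs, mode="valid")

def segment_matrix(x, p, n):
    """Arrange the first p*n observations row by row, as in Eq. (1)."""
    return np.asarray(x[: p * n]).reshape(p, n)

rng = np.random.default_rng(2)
p, n = 4, 6
coeffs = 0.5 ** np.arange(8)                           # hypothetical c_0, ..., c_7
noise = rng.standard_normal(p * n + len(coeffs) - 1)   # enough Z's for p*n values of X_t
x = linear_process(coeffs, noise)
X = segment_matrix(x, p, n)                            # X[i, t] = x[i*n + t] (0-based indices)
sample_cov = X @ X.T / p                               # the matrix p^{-1} X X^T
```

With 0-based indexing, the last entry of row `i` and the first entry of row `i + 1` are consecutive observations of the same process, which is the source of the dependence across rows.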



The interesting feature about this matrix $X$ is that its entries are dependent across both rows and columns. In contrast to the models discussed in the introduction, not all entries far away from each other are asymptotically independent, e.g., the correlation between the entries $X_{i,n}$ and $X_{i+1,1}$, $i = 1,\dots,p-1$, does not depend on $n$. We will investigate the asymptotic distribution of the eigenvalues of $p^{-1}XX^T$ as both $p$ and $n$ tend to infinity such that their ratio $p/n$ converges to a finite, positive limit $y$. We assume that the sequence $(Z_t)_t$ satisfies



$$\mathbb{E}Z_t = 0,\quad \mathbb{E}Z_t^2 = 1,\quad\text{and}\quad \sigma_4 := \sup_t \mathbb{E}Z_t^4 < \infty, \tag{2}$$
and that the following Lindeberg-type condition is satisfied: for each $\epsilon > 0$,

$$\frac{1}{pn}\sum_{t=1}^{pn}\mathbb{E}\Big(Z_t^2\, I_{\{Z_t^2\ge\epsilon n\}}\Big)\to 0,\quad\text{as } n\to\infty. \tag{3}$$

Condition (3) is satisfied if all $\{Z_t\}$ are identically distributed, but that is not necessary. As it turns out, the limiting spectral distribution of $p^{-1}XX^T$ depends only on $y$ and the second-order structure of the underlying linear process $X_t$, which we now recall: its auto-covariance function $\gamma : \mathbb{Z}\to\mathbb{R}$ is defined by $\gamma(h) = \mathbb{E}X_0X_h = \sum_{j=0}^\infty c_jc_{j+|h|}$; its spectral density $f : [0,2\pi]\to\mathbb{R}$ is the Fourier transform of $\gamma$, namely $f(\omega) = \sum_{h\in\mathbb{Z}}\gamma(h)e^{-ih\omega}$. The following is the main result of the paper.

Theorem 1. Let $X_t = \sum_{j=0}^\infty c_jZ_{t-j}$, $t\in\mathbb{Z}$, be a linear stochastic process with continuously differentiable spectral density $f$, and let the matrix $X\in\mathbb{R}^{p\times n}$ be given by Eq. (1). Assume that

i) the sequence $(Z_t)_t$ satisfies conditions (2) and (3),

ii) there exist positive constants $C$, $\delta$ such that $|c_j| \le C(j+1)^{-1-\delta}$, for all $j\in\mathbb{N}\cup\{0\}$.

Then, as $n$ and $p$ tend to infinity such that the ratio $p/n$ converges to a finite positive limit $y$, the empirical spectral distribution of $p^{-1}XX^T$ converges almost surely to a non-random probability distribution $\hat F$ with bounded support. The Stieltjes transform $z\mapsto s_{\hat F}(z)$ of $\hat F$ is the unique mapping $\mathbb{C}^+\to\mathbb{C}^+$ satisfying

$$\frac{1}{s_{\hat F}(z)} = -z + \frac{y}{2\pi}\int_0^{2\pi}\frac{f(\omega)}{1+f(\omega)s_{\hat F}(z)}\,d\omega. \tag{4}$$
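
An equation of this type can be solved numerically by fixed-point iteration. The sketch below is only an illustration under stated assumptions: the process has finitely many non-zero coefficients, so that $f(\omega) = |\sum_j c_je^{-ij\omega}|^2$; the integral is replaced by an average over a uniform grid; the parameter `c` stands for the constant multiplying the normalized integral; and the function names are hypothetical. Convergence of the plain iteration is not guaranteed for every $z$; it is well behaved when $\Im z$ is not too small.

```python
import numpy as np

def spectral_density(coeffs, omega):
    """f(omega) = |sum_j c_j exp(-i j omega)|^2 for finitely many coefficients."""
    j = np.arange(len(coeffs))
    transfer = np.exp(-1j * np.outer(omega, j)) @ np.asarray(coeffs, dtype=float)
    return np.abs(transfer) ** 2

def solve_stieltjes(z, f_values, c, iterations=500):
    """Iterate s -> 1 / (-z + c * mean(f / (1 + f s))), starting from s = -1/z.

    The mean over a uniform grid on [0, 2 pi) approximates (1 / 2 pi) times the integral.
    Each step maps the upper half-plane into itself.
    """
    s = -1.0 / z
    for _ in range(iterations):
        s = 1.0 / (-z + c * np.mean(f_values / (1.0 + f_values * s)))
    return s

omega = 2 * np.pi * np.arange(2048) / 2048
z, c = 0.5 + 2.0j, 0.5
f_ma1 = spectral_density([1.0, 0.5], omega)   # hypothetical MA(1) process
s_ma1 = solve_stieltjes(z, f_ma1, c)
```

For a constant spectral density $f\equiv 1$ the equation reduces to the quadratic $zs^2 + (z+1-c)s + 1 = 0$, which offers a simple consistency check for the iteration.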

Remark 1. The assumption that the coefficients $(c_j)_j$ decay at least polynomially is not very restrictive; it allows, e.g., for $X_t$ to be an ARMA or fractionally integrated ARMA process, which exhibits long-range dependence [9, 13]. In the latter case the entries of the matrix $X$ are long-range dependent as well.

Remark 2. It is possible to generalize the proof of Theorem 1 so that the result also holds for non-causal processes, where $X_t = \sum_{j=-\infty}^\infty c_jZ_{t-j}$. The required changes are merely notational; the only difference in the result is that the auto-covariance function is then given by $\sum_{j=-\infty}^\infty c_jc_{j+|h|}$.

The distribution $\hat F$ can be obtained from $s_{\hat F}$ via the Stieltjes-Perron inversion formula [5, Theorem B.8], which states that for all continuity points $0 < a < b$ of $\hat F$, it holds that $\hat F([a,b]) = \frac{1}{\pi}\lim_{\epsilon\to0+}\int_a^b \Im s_{\hat F}(x+\epsilon i)\,dx$. In general, the analytic determination of this distribution is not feasible. It is, however, easy to check that for the special case of independent entries one recovers the classical Marchenko-Pastur law.
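
The inversion formula can be tried out in the independent case, where $f\equiv1$ and an equation of the form (4) becomes the quadratic $zs^2 + (z+1-c)s + 1 = 0$, with `c` denoting the constant in front of the normalized integral. The following sketch (function names and the value `c = 2` are illustrative assumptions) selects the root in the upper half-plane and approximates the density by $\pi^{-1}\Im s(x+\epsilon i)$ for a small fixed $\epsilon$:

```python
import numpy as np

def stieltjes_white_noise(z, c):
    """Root with the larger imaginary part of z*s^2 + (z + 1 - c)*s + 1 = 0."""
    b = z + 1.0 - c
    root = np.sqrt(b * b - 4.0 * z + 0j)
    s1 = (-b + root) / (2.0 * z)
    s2 = (-b - root) / (2.0 * z)
    return np.where(s1.imag > s2.imag, s1, s2)

def density(x, c, eps=1e-4):
    """Approximate density via the inversion formula with a small, fixed eps > 0."""
    return stieltjes_white_noise(x + 1j * eps, c).imag / np.pi

x = np.linspace(0.0, 8.0, 8001)
d = density(x, 2.0, eps=1e-3)
total_mass = np.sum(d) * (x[1] - x[0])   # Riemann sum; close to one when c >= 1
```

For `c >= 1` the resulting measure has no atom at zero, so the approximate density should integrate to roughly one; for `c < 1` part of the mass sits at the origin and is not captured by the density.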

3. Proof of Theorem 1

The strategy in the proof of Theorem 1 is to show that the limiting spectral distribution of $p^{-1}XX^T$ is stable under modifications of $X$ which reduce the sample covariance matrix to the form $p^{-1}ZHZ^T$, for a matrix $Z$ with i.i.d. entries, and some positive definite $H$. To this end we will repeatedly use the following lemma which presents sufficient conditions for the limiting spectral distributions of two sequences of matrices to be equal.

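Lemma 2 (Trace criterion). Let $A_{1,n}$, $A_{2,n}$ be sequences of $p\times n$ matrices, where $p = p_n$ depends on $n$ such that $p_n\to\infty$ as $n\to\infty$. Assume that the spectral distribution $F^{p^{-1}A_{1,n}A_{1,n}^T}$ converges almost surely to a deterministic limit $\hat F^{p^{-1}A_{1,n}A_{1,n}^T}$ as $n$ tends to infinity. If there exists a positive number $\epsilon$ such that

i) $p^{-4}\,\mathbb{E}\big[\operatorname{tr}(A_{1,n}-A_{2,n})(A_{1,n}-A_{2,n})^T\big]^2 = O(n^{-1-\epsilon})$,

ii) $p^{-2}\,\mathbb{E}\operatorname{tr}A_{i,n}A_{i,n}^T = O(1)$, $i = 1,2$, and

iii) $p^{-4}\operatorname{Var}\operatorname{tr}A_{i,n}A_{i,n}^T = O(n^{-1-\epsilon})$, $i = 1,2$,

then the spectral distribution of $p^{-1}A_{2,n}A_{2,n}^T$ is convergent almost surely with the same limit $\hat F^{p^{-1}A_{1,n}A_{1,n}^T}$.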

Proof. The claim is a direct consequence of Chebyshev's inequality, the first Borel-Cantelli lemma, 
and Bai and Silverstein [5, Corollary A.42] □ 

With the constants $C$ and $\delta$ from assumption ii) of Theorem 1 we define $\bar c_j := C(j+1)^{-1-\delta}$, such that $|c_j| \le \bar c_j$ for all $j$. Without further reference we will repeatedly use the fact that $j\mapsto\bar c_j$ is monotone, that $\sum_{j=0}^\infty\bar c_j^{\,\alpha}$ is finite for every $\alpha\ge1$, and that $\sum_{j=n}^\infty\bar c_j^{\,\alpha}$ is of order $O(n^{1-\alpha(1+\delta)})$. Since it is difficult to deal with infinite-order moving average processes directly, it is convenient to truncate the entries of the matrix $X$ by defining $\tilde X_t = \sum_{j=0}^n c_jZ_{t-j}$ and $\tilde X = (\tilde X_{(i-1)n+t})_{it}$; this is different from the usual truncation of the support of the entries of a random matrix.

Proposition 3 (Truncation). If the empirical spectral distribution of $p^{-1}\tilde X\tilde X^T$ converges to a limit, then the empirical spectral distribution of $p^{-1}XX^T$ converges to the same limit.

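Proof. The proof proceeds in two steps in which we verify conditions i) to iii) of Lemma 2.

Step 1. The definitions of $X$ and $\tilde X$ imply that

$$\Delta_{X,\tilde X} := \frac{1}{p^2}\operatorname{tr}(X-\tilde X)(X-\tilde X)^T = \frac{1}{p^2}\sum_{i=1}^p\sum_{t=1}^n\big(X_{(i-1)n+t}-\tilde X_{(i-1)n+t}\big)^2 = \frac{1}{p^2}\sum_{i=1}^p\sum_{t=1}^n\sum_{k,k'=n+1}^\infty Z_{(i-1)n+t-k}Z_{(i-1)n+t-k'}\,c_kc_{k'}.$$

We shall show that the second moment of $\Delta_{X,\tilde X}$ is of order at most $n^{-2-2\delta}$. Since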

$$\sum_{k,k',m,m'=n+1}^\infty \mathbb{E}\big|Z_{(i-1)n+t-k}Z_{(i-1)n+t-k'}Z_{(i'-1)n+t'-m}Z_{(i'-1)n+t'-m'}\big|\,|c_k||c_{k'}||c_m||c_{m'}| \le \sigma_4\Big(\sum_{k=0}^\infty|c_k|\Big)^4 < \infty, \tag{5}$$



we can apply Fubini's theorem to interchange expectation and summation in the computation of

$$\mu_{2,\Delta_{X,\tilde X}} := \mathbb{E}\Delta_{X,\tilde X}^2 = \frac{1}{p^4}\sum_{i,i'=1}^p\sum_{t,t'=1}^n\sum_{k,k',m,m'=n+1}^\infty \mathbb{E}\big[Z_{(i-1)n+t-k}Z_{(i-1)n+t-k'}Z_{(i'-1)n+t'-m}Z_{(i'-1)n+t'-m'}\big]\,c_kc_{k'}c_mc_{m'}. \tag{6}$$

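Since the $\{Z_t\}$ are independent, the expectation in that sum is non-zero only if all four $Z$ are the same or else one can match the indices in two pairs. In the latter case we distinguish three cases according to which factor the first $Z$ is paired with. This leads to the additive decomposition

$$\mu_{2,\Delta_{X,\tilde X}} = \mu^{(1234)}_{2,\Delta_{X,\tilde X}} + \mu^{(12)(34)}_{2,\Delta_{X,\tilde X}} + \mu^{(13)(24)}_{2,\Delta_{X,\tilde X}} + \mu^{(14)(23)}_{2,\Delta_{X,\tilde X}}, \tag{7}$$

where the superscripts indicate which of the four factors are equal: $(1234)$ means that all four coincide, and $(12)(34)$, for instance, that the first is paired with the second and the third with the fourth. For the contribution from all four $Z$ being equal it holds that $k = k'$, $m = m'$, and $(i-1)n+t-k = (i'-1)n+t'-m$, so that

$$\mu^{(1234)}_{2,\Delta_{X,\tilde X}} \le \frac{\sigma_4}{p^4}\sum_{i,i'=1}^p\sum_{t,t'=1}^n\ \sum_{m=\max\{n+1,\,n+1-(i-i')n-(t-t')\}}^\infty c^2_{(i-i')n+(t-t')+m}\,c_m^2.$$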
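If we introduce the new summation variables $\delta_i := i-i'$ and $\delta_t := t-t'$, we obtain

$$\mu^{(1234)}_{2,\Delta_{X,\tilde X}} \le \frac{\sigma_4}{p^4}\sum_{\delta_i=1-p}^{p-1}\underbrace{(p-|\delta_i|)}_{\le p}\sum_{\delta_t=1-n}^{n-1}\underbrace{(n-|\delta_t|)}_{\le n}\ \sum_{m=\max\{n+1,\,n+1-\delta_in-\delta_t\}}^\infty c^2_{m+\delta_in+\delta_t}\,c_m^2.$$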

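If $\delta_i$ is positive, then $\delta_in+\delta_t$ is positive as well; the fact that $|c_j|$ is bounded by $\bar c_j$ and the monotonicity of $j\mapsto\bar c_j$ imply that $c^2_{m+\delta_in+\delta_t} \le \bar c_{(\delta_i-1)n}\,\bar c_{\delta_t+n}$, so that the contribution from $\delta_i\ge1$ can be estimated as

$$\mu^{(1234),+}_{2,\Delta_{X,\tilde X}} \le \underbrace{\frac{\sigma_4n}{p^3}}_{=O(n^{-2})}\ \underbrace{\sum_{\delta_i=1}^{p-1}\bar c_{(\delta_i-1)n}}_{=O(1)}\ \underbrace{\sum_{\delta_t=1}^{2n-1}\bar c_{\delta_t}}_{=O(1)}\ \underbrace{\sum_{m=n+1}^\infty\bar c_m^{\,2}}_{=O(n^{-1-2\delta})} = O(n^{-3-2\delta}).$$

An analogous argument shows that the contribution from $\delta_i\le-1$, denoted by $\mu^{(1234),-}_{2,\Delta_{X,\tilde X}}$, is of the same order of magnitude. The contribution to $\mu^{(1234)}_{2,\Delta_{X,\tilde X}}$ from $\delta_i = 0$ is given by

$$\mu^{(1234),0}_{2,\Delta_{X,\tilde X}} \le \frac{\sigma_4n}{p^3}\sum_{\delta_t=1-n}^{n-1}\ \sum_{m=\max\{n+1,\,n+1-\delta_t\}}^\infty c_m^2c^2_{m+\delta_t} \le \underbrace{\frac{\sigma_4n}{p^3}}_{=O(n^{-2})}\Bigg[2\underbrace{\sum_{\delta_t=1}^{n-1}\bar c_{\delta_t}^{\,2}}_{=O(1)}\ \underbrace{\sum_{m=n+1}^\infty\bar c_m^{\,2}}_{=O(n^{-1-2\delta})} + \underbrace{\sum_{m=n+1}^\infty\bar c_m^{\,4}}_{=O(n^{-3-4\delta})}\Bigg] = O(n^{-3-2\delta}).$$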



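By combining the last two displays, it follows that $\mu^{(1234)}_{2,\Delta_{X,\tilde X}}$ is of order $O(n^{-3-2\delta})$. The second term in Eq. (7) corresponds to $k = k'$, $m = m'$, and $(i-1)n+t-k \ne (i'-1)n+t'-m$. The restriction that not all four factors be equal is taken into account by subtracting $\mu^{(1234)}_{2,\Delta_{X,\tilde X}}$; consequently,

$$\mu^{(12)(34)}_{2,\Delta_{X,\tilde X}} = \underbrace{\frac{1}{p^4}\sum_{i,i'=1}^p\sum_{t,t'=1}^n}_{=O(1)}\ \underbrace{\sum_{k,m=n+1}^\infty c_k^2c_m^2}_{=O(n^{-2-4\delta})} - \mu^{(1234)}_{2,\Delta_{X,\tilde X}}.$$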



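It remains to analyse $\mu^{(13)(24)}_{2,\Delta_{X,\tilde X}}$ which, by symmetry, is equal to $\mu^{(14)(23)}_{2,\Delta_{X,\tilde X}}$. If the first factor is paired with the third, the condition for a non-vanishing expectation becomes $k = m+(i-i')n+t-t'$, $k' = m'+(i-i')n+t-t'$, and $m\ne m'$. Again introducing the new summation variables $\delta_i := i-i'$ and $\delta_t := t-t'$, we obtain that

$$\mu^{(13)(24)}_{2,\Delta_{X,\tilde X}} = \frac{1}{p^4}\sum_{\delta_i=1-p}^{p-1}\underbrace{(p-|\delta_i|)}_{\le p}\sum_{\delta_t=1-n}^{n-1}\underbrace{(n-|\delta_t|)}_{\le n}\ \sum_{\substack{m,m'=\max\{n+1,\,n+1-\delta_in-\delta_t\}\\ m\ne m'}}^\infty c_mc_{m'}c_{m+\delta_in+\delta_t}c_{m'+\delta_in+\delta_t}.$$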



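As in the analysis of $\mu^{(1234)}_{2,\Delta_{X,\tilde X}}$ we obtain the contribution from $\delta_i\ge1$ as

$$\Big|\mu^{(13)(24),+}_{2,\Delta_{X,\tilde X}}\Big| \le \underbrace{\frac{n}{p^3}}_{=O(n^{-2})}\ \underbrace{\sum_{\delta_i=1}^{p-1}\bar c_{(\delta_i-1)n}}_{=O(1)}\ \underbrace{\sum_{\delta_t=1}^{2n-1}\bar c_{\delta_t}}_{=O(1)}\ \underbrace{\sum_{m,m'=n+1}^\infty\bar c_m\bar c_{m'}}_{=O(n^{-2\delta})} + \mu^{(1234)}_{2,\Delta_{X,\tilde X}} = O(n^{-2-2\delta}). \tag{8}$$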
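Finally, for the contribution from $\delta_i = 0$ one finds that

$$\Big|\mu^{(13)(24),0}_{2,\Delta_{X,\tilde X}}\Big| \le \frac{n}{p^3}\sum_{\delta_t=1-n}^{n-1}\ \sum_{m,m'=\max\{n+1,\,n+1-\delta_t\}}^\infty\big|c_mc_{m'}c_{m+\delta_t}c_{m'+\delta_t}\big| + \mu^{(1234)}_{2,\Delta_{X,\tilde X}} \le \underbrace{\frac{n}{p^3}}_{=O(n^{-2})}\Bigg[2\sum_{\delta_t=1}^{n-1}\bar c_{\delta_t}^{\,2}\underbrace{\sum_{m,m'=n+1}^\infty\bar c_m\bar c_{m'}}_{=O(n^{-2\delta})} + \sum_{m,m'=n+1}^\infty\bar c_m^{\,2}\bar c_{m'}^{\,2}\Bigg] + \mu^{(1234)}_{2,\Delta_{X,\tilde X}} = O(n^{-2-2\delta}). \tag{9}$$

The last two displays (8) and (9) imply that

$$\mu^{(13)(24)}_{2,\Delta_{X,\tilde X}} = \mu^{(13)(24),+}_{2,\Delta_{X,\tilde X}} + \mu^{(13)(24),0}_{2,\Delta_{X,\tilde X}} + \mu^{(13)(24),-}_{2,\Delta_{X,\tilde X}} = O(n^{-2-2\delta}).$$

Thus, $\mu_{2,\Delta_{X,\tilde X}}$ is of order $O(n^{-2-2\delta})$, as claimed.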


Step 2. Next we verify assumptions ii) and iii) of Lemma 2, which means that we show that both $\Sigma_X := p^{-2}\operatorname{tr}XX^T$ and $\Sigma_{\tilde X} := p^{-2}\operatorname{tr}\tilde X\tilde X^T$ have bounded first moments and variances of order $n^{-1-\epsilon}$, for some $\epsilon > 0$; in fact, $\epsilon$ will turn out to be one. For $\Sigma_X$ we obtain

$$\mu_{1,\Sigma_X} := \mathbb{E}\Sigma_X = \frac{1}{p^2}\sum_{i=1}^p\sum_{t=1}^n\sum_{k,k'=0}^\infty\mathbb{E}\big[Z_{(i-1)n+t-k}Z_{(i-1)n+t-k'}\big]c_kc_{k'} = \frac{n}{p}\sum_{k=0}^\infty c_k^2,$$

where the change of the order of expectation and summation is valid by Fubini's theorem. Using Eq. (5) and Fubini's theorem, the second moment of $\Sigma_X$ becomes

$$\mu_{2,\Sigma_X} := \mathbb{E}\Sigma_X^2 = \frac{1}{p^4}\sum_{i,i'=1}^p\sum_{t,t'=1}^n\sum_{k,k',m,m'=0}^\infty\mathbb{E}\big[Z_{(i-1)n+t-k}Z_{(i-1)n+t-k'}Z_{(i'-1)n+t'-m}Z_{(i'-1)n+t'-m'}\big]c_kc_{k'}c_mc_{m'}.$$

This sum coincides with the expression analysed in Eq. (6), except that here the $k, k', m, m'$ sums start at zero, and not at $n+1$. A straightforward adaptation of the arguments there shows that $\mu_{2,\Sigma_X}$ equals $n^2p^{-2}\big(\sum_{k=0}^\infty c_k^2\big)^2 + O(n^{-2})$, and, consequently, that $\operatorname{Var}\Sigma_X = \mu_{2,\Sigma_X} - (\mu_{1,\Sigma_X})^2 = O(n^{-2})$. Analogous computations show that $\mathbb{E}\Sigma_{\tilde X}$ is bounded, and that $\operatorname{Var}\Sigma_{\tilde X} = O(n^{-2})$. Thus, conditions ii) and iii) of Lemma 2 are verified, and the proof of the proposition is complete. □

Because of Proposition 3 the problem of determining the limiting spectral distribution of the sample covariance matrix $p^{-1}XX^T$ has been reduced to computing the limiting spectral distribution of $p^{-1}\tilde X\tilde X^T$, where now, for fixed $n$, the matrix $\tilde X$ depends on only finitely many of the noise variables $Z_t$. The fact that the entries of $\tilde X$ are finite-order moving average processes and therefore linearly dependent on the $Z_t$ allows for $\tilde X$ to be written as a linear transformation of the i.i.d. matrix $Z := (Z_{(i-2)n+t})_{i=1,\dots,p+1,\ t=1,\dots,n}$. We emphasize that $Z$, in contrast to $X$ and $\tilde X$, is a $(p+1)\times n$ matrix; this is necessary because the entries in the first row of $\tilde X$ depend on noise variables with negative indices, up to and including $Z_{1-n}$. In order to formulate the transformation that maps $Z$ to $\tilde X$ concisely in the next lemma, we define the matrices $K_n = \begin{pmatrix}0 & 0\\ \mathbf{1}_{n-1} & 0\end{pmatrix}\in\mathbb{R}^{n\times n}$, as well as the polynomials

$$\chi_n(z) = c_0 + c_1z + \dots + c_nz^n \quad\text{and}\quad \bar\chi_n(z) = z^n\chi_n(1/z) = c_n + c_{n-1}z + \dots + c_0z^n.$$



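Lemma 4. With $\tilde X$, $Z$, $K_n$ and $\chi_n$, $\bar\chi_n$ defined as before it holds that

$$\tilde X = \begin{bmatrix}0 & \mathbf{1}_p\end{bmatrix}\begin{bmatrix}\mathbf{1}_{p+1} & K_{p+1}\end{bmatrix}\begin{pmatrix}Z & 0\\ 0 & Z\end{pmatrix}\begin{pmatrix}\chi_n(K_n^T)\\ \bar\chi_n(K_n)\end{pmatrix}. \tag{10}$$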
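
Proof. Let $s_N : \mathbb{R}^N\to\mathbb{R}^N$ be the right shift operator defined by $s_N(v_1,\dots,v_N) = (0, v_1,\dots,v_{N-1})$ and for positive integers $r$, $s$ denote by $\operatorname{vec}_{r,s} : \mathbb{R}^{r\times s}\to\mathbb{R}^{rs}$ the bijective linear operator that transforms a matrix into a vector by horizontally concatenating its subsequent rows, starting with the first one. The operator $S_{r,s}$ is then defined as $S_{r,s} = \operatorname{vec}_{r,s}^{-1}\circ s_{rs}\circ\operatorname{vec}_{r,s}$. This operator shifts all entries of a matrix to the right except for the entries in the last column, which are shifted down and moved into the first column. For $k = 1, 2, \dots$, the operator $S^k_{r,s}$ is defined as the $k$-fold composition of $S_{r,s}$. In the following, we write $S := S_{p+1,n}$. With this notation it is clear that $\tilde X = \begin{bmatrix}0 & \mathbf{1}_p\end{bmatrix}\chi_n(S)Z$. In order to obtain Eq. (10), we observe that the action of $S$ can be written in terms of matrix multiplications as $SZ = K_{p+1}ZE + ZK_n^T$, where the entries of the $n\times n$ matrix $E$ are all zero except for a one in the lower left corner. Using the fact that $E(K_n^T)^mE$ is zero for every non-negative integer $m$ it follows by induction that $S^k$, $k = 1,\dots,n$, acts like

$$S^kZ = Z(K_n^T)^k + K_{p+1}Z\sum_{i=1}^k(K_n^T)^{k-i}E(K_n^T)^{i-1} = \begin{bmatrix}\mathbf{1}_{p+1} & K_{p+1}\end{bmatrix}\begin{pmatrix}Z & 0\\ 0 & Z\end{pmatrix}\begin{pmatrix}(K_n^T)^k\\ K_n^{\,n-k}\end{pmatrix}.$$

This implies that

$$\tilde X = \begin{bmatrix}0 & \mathbf{1}_p\end{bmatrix}\begin{bmatrix}\mathbf{1}_{p+1} & K_{p+1}\end{bmatrix}\begin{pmatrix}Z & 0\\ 0 & Z\end{pmatrix}\sum_{k=0}^nc_k\begin{pmatrix}(K_n^T)^k\\ K_n^{\,n-k}\end{pmatrix} = \begin{bmatrix}0 & \mathbf{1}_p\end{bmatrix}\begin{bmatrix}\mathbf{1}_{p+1} & K_{p+1}\end{bmatrix}\begin{pmatrix}Z & 0\\ 0 & Z\end{pmatrix}\begin{pmatrix}\chi_n(K_n^T)\\ \bar\chi_n(K_n)\end{pmatrix}$$

and completes the proof. □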
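
The block factorisation can be checked numerically for small dimensions. The sketch below assumes that $K_n$ is the $n\times n$ matrix with ones on the first subdiagonal and that the identity to be verified reads $\tilde X = [0\ \mathbf{1}_p]\,[\mathbf{1}_{p+1}\ K_{p+1}]\operatorname{diag}(Z,Z)\,[\chi_n(K_n^T);\ \bar\chi_n(K_n)]$; the dimensions, random coefficients and variable names are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
p, n = 3, 4
c = rng.standard_normal(n + 1)             # c_0, ..., c_n
z = rng.standard_normal((p + 1) * n)       # z[s + n - 1] stores Z_s for s = 1 - n, ..., p*n
Z = z.reshape(p + 1, n)                    # Z[i-1, t-1] = Z_{(i-2)n + t}

# Truncated matrix computed directly from its definition (1-based i, t as in the text).
X_trunc = np.empty((p, n))
for i in range(1, p + 1):
    for t in range(1, n + 1):
        s = (i - 1) * n + t
        X_trunc[i - 1, t - 1] = sum(c[j] * z[s - j + n - 1] for j in range(n + 1))

def K(m):
    """m x m matrix with ones on the first subdiagonal."""
    return np.eye(m, k=-1)

chi = sum(c[k] * np.linalg.matrix_power(K(n).T, k) for k in range(n + 1))
chi_bar = sum(c[k] * np.linalg.matrix_power(K(n), n - k) for k in range(n + 1))

left = np.hstack([np.zeros((p, 1)), np.eye(p)]) @ np.hstack([np.eye(p + 1), K(p + 1)])
middle = np.block([[Z, np.zeros_like(Z)], [np.zeros_like(Z), Z]])
right = np.vstack([chi, chi_bar])
X_formula = left @ middle @ right

agree = np.allclose(X_trunc, X_formula)
```

Multiplying $Z$ from the right by powers of $K_n^T$ shifts its columns to the right, while $K_{p+1}$ from the left moves rows down; together these reproduce the wrap-around of the row-wise shift described in the proof.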



While the last lemma gives an explicit description of the relation between $Z$ and $\tilde X$, it is impractical for directly determining the limiting spectral distribution (LSD) of $p^{-1}\tilde X\tilde X^T$. The reason is that $Z$ appears twice in the central block-diagonal matrix and is moreover multiplied by some deterministic matrices from both the left and the right. The LSD of the product of three random matrices has been computed in the literature [25], but this result is not applicable in our situation due to the appearance of the random block matrix in Eq. (10). Sample covariance matrices derived from random block matrices have been considered in Rashidi Far et al. [19]. However, they only treat the Gaussian case and, more importantly, do not cover the case of a non-trivial population covariance matrix. We are thus not aware of any result that would allow the LSD of $p^{-1}\tilde X\tilde X^T$ to be derived directly from Lemma 4.

The next proposition allows us to circumvent this problem. It is shown that, at least asymptotically and at the cost of slightly changing the size of the involved matrices, one can simplify the structure of $\tilde X$ so that $Z$ appears only once and is multiplied by a deterministic matrix only from the right.



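Proposition 5. Let $Z$, $K_n$ and $\chi_n$, $\bar\chi_n$ be as before and define the matrix $\hat X := Z\Omega\in\mathbb{R}^{(p+1)\times(n+1)}$, where

$$\Omega = \begin{bmatrix}\begin{bmatrix}0 & \mathbf{1}_n\end{bmatrix} & \begin{bmatrix}\mathbf{1}_n & 0\end{bmatrix}\end{bmatrix}\begin{pmatrix}\chi_{n+1}(K_{n+1}^T)\\ \bar\chi_{n+1}(K_{n+1})\end{pmatrix}\in\mathbb{R}^{n\times(n+1)}. \tag{11}$$

If the empirical spectral distribution of $p^{-1}\hat X\hat X^T$ converges to a limit, then the empirical spectral distribution of $p^{-1}\tilde X\tilde X^T$ converges to the same limit.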



Proof. In order to be able to compare the limiting spectral distributions of $p^{-1}\hat X\hat X^T$ and $p^{-1}\tilde X\tilde X^T$ in spite of their dimensions being different, we introduce the matrix $\bar X = \begin{pmatrix}0 & 0\\ 0 & \tilde X\end{pmatrix}\in\mathbb{R}^{(p+1)\times(n+1)}$. Clearly, $F^{p^{-1}\bar X\bar X^T} = (p+1)^{-1}\delta_0 + p(p+1)^{-1}F^{p^{-1}\tilde X\tilde X^T}$, which implies equality of the limiting spectral distributions provided either of the two, and hence both, exists. It is therefore sufficient to show that the LSD of $p^{-1}\hat X\hat X^T$ and $p^{-1}\bar X\bar X^T$ are identical; this will be done by verifying the three conditions of Lemma 2. The remainder of the proof will be divided into two parts. In the first part we check the validity of assumption i) about the difference $\hat X - \bar X$, whereas in the second one we consider the terms $\operatorname{tr}\hat X\hat X^T$ and $\operatorname{tr}\bar X\bar X^T$, which appear in conditions ii) and iii).
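
Step 1. Using the definitions of $\hat X$ and $\bar X$, it follows that

$$\begin{aligned}\Delta_{\hat X,\bar X} :={}& \frac{1}{p^2}\operatorname{tr}(\hat X-\bar X)(\hat X-\bar X)^T = \frac{1}{p^2}\sum_{i=1}^{p+1}\sum_{j=1}^{n+1}\big[\hat X_{ij}-\bar X_{ij}\big]^2\\ \le{}& \frac{2}{p^2}\sum_{i=2}^{p+1}\sum_{j=2}^{n+1}\Bigg[\sum_{k,k'=j}^{n}Z_{(i-2)n+k}Z_{(i-2)n+k'}\,c_{j-k+n+1}c_{j-k'+n+1} + \sum_{k,k'=j-1}^{n}Z_{(i-3)n+k}Z_{(i-3)n+k'}\,c_{j-k+n-1}c_{j-k'+n-1}\Bigg]\\ &+ \frac{1}{p^2}\sum_{i=1}^{p+1}\sum_{k,k'=1}^{n}Z_{(i-2)n+k}Z_{(i-2)n+k'}\,c_{n-k+2}c_{n-k'+2} + \frac{2}{p^2}\sum_{j=2}^{n+1}\sum_{k,k'=1}^{j-1}Z_{-n+k}Z_{-n+k'}\,c_{j-k-1}c_{j-k'-1}\\ &+ \frac{2}{p^2}\sum_{j=2}^{n+1}\sum_{k,k'=j}^{n}Z_{-n+k}Z_{-n+k'}\,c_{j-k+n+1}c_{j-k'+n+1} =: \sum_{i=1}^5\Delta^{(i)}_{\hat X,\bar X},\end{aligned} \tag{12}$$

where the elementary inequality $(a+b)^2 \le 2a^2 + 2b^2$ was used twice. In order to show that the second moments of expression (12) are summable, we consider each term in turn. For the second moment of the first term of Eq. (12) we obtain

$$\mu_{2,\Delta^{(1)}_{\hat X,\bar X}} := \mathbb{E}\big(\Delta^{(1)}_{\hat X,\bar X}\big)^2 = \frac{4}{p^4}\sum_{i,i'=2}^{p+1}\sum_{j,j'=2}^{n+1}\sum_{k,k'=1}^{n-j+1}\sum_{m,m'=1}^{n-j'+1}\mathbb{E}\big[Z_{(i-1)n-k+1}Z_{(i-1)n-k'+1}Z_{(i'-1)n-m+1}Z_{(i'-1)n-m'+1}\big]\,c_{j+k}c_{j+k'}c_{j'+m}c_{j'+m'}.$$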



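As before we consider all configurations where the above expectation is not zero. The expectation is bounded by $\sigma_4$ if $i = i'$ and $k, k', m, m'$ are equal, and, hence,

$$\mu^{(1234)}_{2,\Delta^{(1)}_{\hat X,\bar X}} \le \frac{4\sigma_4}{p^4}\sum_{i=2}^{p+1}\sum_{k=1}^{n}\Bigg(\sum_{j=2}^{n+1}c^2_{j+k}\Bigg)^2 \le \frac{4\sigma_4}{p^3}\sum_{k=1}^{n}\Bigg(\sum_{j=2}^{n+1}\bar c^{\,2}_{j+k}\Bigg)^2 = O(n^{-3}).$$

The expectation is one if the four $Z$ can be collected in two non-equal pairs. The first factor equals the second, and the third equals the fourth if $k = k'$ and $m = m'$, and thus

$$\mu^{(12)(34)}_{2,\Delta^{(1)}_{\hat X,\bar X}} = \frac{4}{p^4}\sum_{i,i'=2}^{p+1}\sum_{j,j'=2}^{n+1}\sum_{k=1}^{n-j+1}\sum_{m=1}^{n-j'+1}c^2_{j+k}c^2_{j'+m} - \mu^{(1234)}_{2,\Delta^{(1)}_{\hat X,\bar X}} = \frac{4}{p^2}\Bigg(\sum_{j=2}^{n+1}\sum_{k=1}^{n-j+1}c^2_{j+k}\Bigg)^2 - \mu^{(1234)}_{2,\Delta^{(1)}_{\hat X,\bar X}} = O(n^{-2}).$$

Likewise, the contribution from pairing the first factor with the third, and the second with the fourth, can be estimated as

$$\Big|\mu^{(13)(24)}_{2,\Delta^{(1)}_{\hat X,\bar X}}\Big| \le \frac{4}{p^4}\sum_{i=2}^{p+1}\sum_{j,j'=2}^{n+1}\sum_{k,k'=1}^{n}\big|c_{j+k}c_{j+k'}c_{j'+k}c_{j'+k'}\big| + \mu^{(1234)}_{2,\Delta^{(1)}_{\hat X,\bar X}} \le \frac{4}{p^3}\Bigg(\sum_{j=1}^{n+1}\bar c_j\Bigg)^4 + \mu^{(1234)}_{2,\Delta^{(1)}_{\hat X,\bar X}} = O(n^{-3}).$$

Obviously, the configuration $(14)(23)$ can be handled the same way as above. Thus we have shown that the second moment of $\Delta^{(1)}_{\hat X,\bar X}$, the first term in Eq. (12), is of order $n^{-2}$. This can be shown for the second term in Eq. (12) in the same way. We now consider the second moment of the third term in Eq. (12):

$$\mu_{2,\Delta^{(3)}_{\hat X,\bar X}} := \mathbb{E}\big(\Delta^{(3)}_{\hat X,\bar X}\big)^2 = \frac{1}{p^4}\sum_{i,i'=1}^{p+1}\sum_{k,k',m,m'=1}^{n}\mathbb{E}\big[Z_{(i-2)n+k}Z_{(i-2)n+k'}Z_{(i'-2)n+m}Z_{(i'-2)n+m'}\big]\,c_{n-k+2}c_{n-k'+2}c_{n-m+2}c_{n-m'+2}.$$

Distinguishing the same cases as before, we have $\mu^{(1234)}_{2,\Delta^{(3)}_{\hat X,\bar X}} \le \sigma_4\frac{p+1}{p^4}\sum_{k=1}^nc^4_{n-k+2} = O(n^{-3})$ and, thus,

$$\mu^{(12)(34)}_{2,\Delta^{(3)}_{\hat X,\bar X}} = \frac{(p+1)^2}{p^4}\Bigg(\sum_{k=1}^nc^2_{n-k+2}\Bigg)^2 - \mu^{(1234)}_{2,\Delta^{(3)}_{\hat X,\bar X}} = O(n^{-2}),$$

as well as $\mu^{(13)(24)}_{2,\Delta^{(3)}_{\hat X,\bar X}} = \mu^{(14)(23)}_{2,\Delta^{(3)}_{\hat X,\bar X}} = O(n^{-3})$. Thus, the second moment of the third term in Eq. (12) is of order $O(n^{-2})$; repeating the foregoing arguments, it can be seen that the second moments of $\Delta^{(4)}_{\hat X,\bar X}$ and $\Delta^{(5)}_{\hat X,\bar X}$, the two last terms in Eq. (12), are of order $O(n^{-2})$ as well, so that we have shown that

$$\frac{1}{p^4}\mathbb{E}\big[\operatorname{tr}(\hat X-\bar X)(\hat X-\bar X)^T\big]^2 = \mathbb{E}\big(\Delta_{\hat X,\bar X}\big)^2 \le 5\sum_{i=1}^5\mu_{2,\Delta^{(i)}_{\hat X,\bar X}} = O(n^{-2}).$$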

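
Step 2. In this step we shall prove that both $\Sigma_{\hat X} := p^{-2}\operatorname{tr}\hat X\hat X^T$ and $\Sigma_{\bar X} := p^{-2}\operatorname{tr}\bar X\bar X^T$ have bounded first moments, and that their variances are summable sequences in $n$, i.e. we check conditions ii) and iii) of Lemma 2. Since $\operatorname{tr}\bar X\bar X^T$ is equal to $\operatorname{tr}\tilde X\tilde X^T$, the claim about $\Sigma_{\bar X}$ has already been shown in the second step of the proof of Proposition 3. For the first term one finds, by the definition of $\hat X$, that

$$\begin{aligned}\Sigma_{\hat X} ={}& \frac{1}{p^2}\sum_{i=1}^{p+1}\sum_{j=1}^{n+1}\Bigg[\sum_{k=1}^{j-1}Z_{(i-2)n+k}\,c_{j-k-1} + \sum_{k=j}^{n}Z_{(i-2)n+k}\,c_{j-k+n+1}\Bigg]^2\\ \le{}& \frac{2}{p^2}\sum_{i=1}^{p+1}\sum_{j=1}^{n+1}\sum_{k,k'=1}^{j-1}Z_{(i-2)n+k}\,c_{j-k-1}\,Z_{(i-2)n+k'}\,c_{j-k'-1} + \frac{2}{p^2}\sum_{i=1}^{p+1}\sum_{j=1}^{n+1}\sum_{k,k'=j}^{n}Z_{(i-2)n+k}\,c_{j-k+n+1}\,Z_{(i-2)n+k'}\,c_{j-k'+n+1} =: \Sigma^{(1)}_{\hat X} + \Sigma^{(2)}_{\hat X}.\end{aligned}$$

Clearly, the first two moments of $\Sigma^{(1)}_{\hat X}$ are given by

$$\mu_{1,\Sigma^{(1)}_{\hat X}} := \mathbb{E}\Sigma^{(1)}_{\hat X} = \frac{2}{p^2}\sum_{i=1}^{p+1}\sum_{j=1}^{n+1}\sum_{k,k'=1}^{j-1}\mathbb{E}\big[Z_{(i-2)n+k}Z_{(i-2)n+k'}\big]\,c_{j-k-1}c_{j-k'-1} = \frac{2(p+1)}{p^2}\sum_{j=1}^{n+1}\sum_{k=1}^{j-1}c^2_{j-k-1},$$

and

$$\mu_{2,\Sigma^{(1)}_{\hat X}} := \mathbb{E}\big(\Sigma^{(1)}_{\hat X}\big)^2 = \frac{4}{p^4}\sum_{i,i'=1}^{p+1}\sum_{j,j'=1}^{n+1}\sum_{k,k'=1}^{j-1}\sum_{m,m'=1}^{j'-1}\mathbb{E}\big[Z_{(i-2)n+k}Z_{(i-2)n+k'}Z_{(i'-2)n+m}Z_{(i'-2)n+m'}\big]\,c_{j-k-1}c_{j-k'-1}c_{j'-m-1}c_{j'-m'-1}.$$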
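We separately consider the case that all four factors are equal, and the three possible pairings of the four $Z$. If all four $Z$ are equal, it must hold that $i = i'$, $k = k' = m = m'$, with contribution

$$\mu^{(1234)}_{2,\Sigma^{(1)}_{\hat X}} \le \frac{4\sigma_4}{p^4}\sum_{i=1}^{p+1}\sum_{j,j'=1}^{n+1}\sum_{k=1}^{\min\{j,j'\}-1}c^2_{j-k-1}c^2_{j'-k-1} \le \frac{4\sigma_4(p+1)}{p^4}\sum_{j,j'=1}^{n+1}\sum_{k=1}^{\min\{j,j'\}-1}\bar c_{j-\min\{j,j'\}}\,\bar c_{j'-\min\{j,j'\}}\,\bar c^{\,2}_{\min\{j,j'\}-k-1} \le \frac{4\sigma_4(p+1)}{p^4}\sum_{j,j'=1}^{n+1}\bar c_0\,\bar c_{|j-j'|}\sum_{k=1}^{n+1}\bar c^{\,2}_{k-1}.$$

Introducing the new summation variable $\delta_j := j-j'$, one finds that

$$\mu^{(1234)}_{2,\Sigma^{(1)}_{\hat X}} \le \frac{4\sigma_4(p+1)(n+1)}{p^4}\,\bar c_0\sum_{\delta_j=-n}^{n}\bar c_{|\delta_j|}\sum_{k=1}^{n+1}\bar c^{\,2}_{k-1} = O(n^{-2}). \tag{13}$$

The first factor being paired with the second, and the third with the fourth, means that $k = k'$, $m = m'$, and $m \ne (i-i')n+k$, so that the contribution of this configuration is given by

$$\mu^{(12)(34)}_{2,\Sigma^{(1)}_{\hat X}} = \frac{4}{p^4}\sum_{i,i'=1}^{p+1}\sum_{j,j'=1}^{n+1}\sum_{k=1}^{j-1}\sum_{m=1}^{j'-1}c^2_{j-k-1}c^2_{j'-m-1} - \mu^{(1234)}_{2,\Sigma^{(1)}_{\hat X}} = \big(\mu_{1,\Sigma^{(1)}_{\hat X}}\big)^2 + O(n^{-2}). \tag{14}$$

For the $(13)(24)$ pairing, the constraints are $i = i'$, $k = m$, $k' = m'$, $k\ne k'$, and the corresponding contribution is

$$\begin{aligned}\Big|\mu^{(13)(24)}_{2,\Sigma^{(1)}_{\hat X}}\Big| \le{}& \frac{4}{p^4}\sum_{i=1}^{p+1}\sum_{j,j'=1}^{n+1}\sum_{k,k'=1}^{\min\{j,j'\}-1}\big|c_{j-k-1}c_{j-k'-1}c_{j'-k-1}c_{j'-k'-1}\big| + \mu^{(1234)}_{2,\Sigma^{(1)}_{\hat X}}\\ \le{}& \frac{4(p+1)}{p^4}\sum_{j,j'=1}^{n+1}\sum_{k,k'=1}^{\min\{j,j'\}-1}\bar c_{j-\min\{j,j'\}}\,\bar c_{j'-\min\{j,j'\}}\,\bar c_{k-1}\bar c_{k'-1} + O(n^{-2})\\ \le{}& \frac{4(p+1)(n+1)}{p^4}\,\bar c_0\Bigg[\bar c_0 + 2\sum_{\delta_j=1}^{n}\bar c_{\delta_j}\Bigg]\Bigg(\sum_{k=1}^{n}\bar c_{k-1}\Bigg)^2 + O(n^{-2}) = O(n^{-2}).\end{aligned} \tag{15}$$

Renaming the summation indices shows that $\mu^{(14)(23)}_{2,\Sigma^{(1)}_{\hat X}} = \mu^{(13)(24)}_{2,\Sigma^{(1)}_{\hat X}}$. Combining this with the displays (13) to (15), it follows that $\operatorname{Var}\Sigma^{(1)}_{\hat X} = \mu_{2,\Sigma^{(1)}_{\hat X}} - \big(\mu_{1,\Sigma^{(1)}_{\hat X}}\big)^2 = O(n^{-2})$. Since a very similar reasoning can be applied to $\Sigma^{(2)}_{\hat X}$, and $p^{-4}\operatorname{Var}\operatorname{tr}\hat X\hat X^T$ is smaller than $2\operatorname{Var}\Sigma^{(1)}_{\hat X} + 2\operatorname{Var}\Sigma^{(2)}_{\hat X}$, we conclude that $p^{-4}\operatorname{Var}\operatorname{tr}\hat X\hat X^T$ is of order $O(n^{-2})$. □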

The intention behind Proposition 5 was to allow the application of results about the limiting spectral distribution of matrices of the form $ZHZ^T$, where $Z$ is an i.i.d. matrix, and $H$ is a positive semidefinite matrix. Expressions for the Stieltjes transform of the LSD of such matrices in terms of the LSD of $H$ have been obtained by Marchenko and Pastur [14], Silverstein and Bai [20], and, in the most general form, by Pan [17]. The next lemma shows that in the current context the population covariance matrix $H$ has the same LSD as the auto-covariance matrix $\Gamma$ of the process $X_t$, which is defined in terms of the auto-covariance function $\gamma(h) = \sum_{j=0}^\infty c_jc_{j+|h|}$ by $\Gamma = (\gamma(i-j))_{ij}$; this correspondence is used to characterize the LSD of $H$ by the spectral density $f$ associated with the coefficients $(c_j)_j$.

Lemma 6. Let $\Omega$ be given by Eq. (11). The limiting spectral distribution of the matrix $\Omega\Omega^T$ exists and is the same as the limiting spectral distribution of the auto-covariance matrix $\Gamma$. It therefore satisfies

$$\int h(\lambda)\,\hat F^{\Omega\Omega^T}(d\lambda) = \frac{1}{2\pi}\int_0^{2\pi}h(f(\omega))\,d\omega, \tag{16}$$

for every continuous function $h$.

Proof. The first claim follows by standard computations from the fact that $\Omega$ is, except for one missing row, a circulant matrix with entries $\Omega_{ij} = c_{n+i-j \bmod (n+1)}$, and Bai and Silverstein [5, Corollaries A.41 and A.42]. The second claim is an application of Szegő's limit theorem about the LSD of Toeplitz matrices; see Szegő [21, Theorem XVIII] for the original result or, e.g., Böttcher and Silbermann [3, Sections 5.4 and 5.5] for a modern treatment. □
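
The content of Eq. (16) for the auto-covariance matrix can be illustrated numerically. The sketch below uses a hypothetical MA(1) process with coefficients $(1, 0.5)$ and the test function $h(x) = x^2$, both chosen only for illustration, and compares the eigenvalue average of $h$ over the Toeplitz matrix $\Gamma$ with the corresponding average of $h\circ f$ over a uniform frequency grid:

```python
import numpy as np

def autocovariance(coeffs, h):
    """gamma(h) = sum_j c_j c_{j+|h|} for finitely many coefficients."""
    c = np.asarray(coeffs, dtype=float)
    h = abs(h)
    return float(np.dot(c[: len(c) - h], c[h:])) if h < len(c) else 0.0

def autocovariance_matrix(coeffs, n):
    """n x n Toeplitz matrix with entries gamma(i - j)."""
    gamma = np.array([autocovariance(coeffs, h) for h in range(n)])
    idx = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return gamma[idx]

coeffs = [1.0, 0.5]     # hypothetical MA(1) coefficients
n = 400
eigenvalues = np.linalg.eigvalsh(autocovariance_matrix(coeffs, n))

omega = 2 * np.pi * np.arange(4096) / 4096
f = np.abs(coeffs[0] + coeffs[1] * np.exp(-1j * omega)) ** 2   # spectral density 1.25 + cos(omega)

lhs = np.mean(eigenvalues ** 2)   # integral of h against the spectral distribution of Gamma
rhs = np.mean(f ** 2)             # (1 / 2 pi) * integral of h(f(omega)) over [0, 2 pi]
```

For finite `n` the two averages differ by a boundary term of order $1/n$, which vanishes in the limit described by the lemma.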

Proof of Theorem 1. According to Proposition 5, the matrix $\hat X\hat X^T$ is of the form $Z\Omega\Omega^TZ^T$, where $\Omega$ is given by Eq. (11). Using Pan [17, Theorem 1] and the fact that, by Lemma 6, the limiting spectral distribution of $\Omega\Omega^T$ exists, it follows that the limiting spectral distribution $\hat F^{p^{-1}\hat X\hat X^T}$ exists. Therefore, the combination of Propositions 3 and 5 shows that the limiting spectral distribution of $p^{-1}XX^T$ also exists and is the same as that of $p^{-1}\hat X\hat X^T$. Pan [17, equation (1.2)] thus implies that the Stieltjes transform of $\hat F^{p^{-1}XX^T}$ is the unique mapping $s_{\hat F^{p^{-1}XX^T}} : \mathbb{C}^+\to\mathbb{C}^+$ which solves

$$\frac{1}{s_{\hat F^{p^{-1}XX^T}}(z)} = -z + y\int_{\mathbb{R}}\frac{\lambda}{1+\lambda\,s_{\hat F^{p^{-1}XX^T}}(z)}\,\hat F^{\Omega\Omega^T}(d\lambda),$$

and Eq. (16) from Lemma 6 completes the proof. □

4. Sketch of an alternative proof of Theorem 1

In this section we indicate how Theorem 1 could be proved alternatively using the methods employed in Pfaffel and Schlemm [18]. We denote by $X^{(\alpha)}$ the matrix which is defined as in Eq. (1) but with the linear process being truncated at $\lfloor n^\alpha\rfloor$ with $0 < \alpha < 1$, i.e. $X^{(\alpha)} = \big(\sum_{j=0}^{\lfloor n^\alpha\rfloor}c_jZ_{(i-1)n+t-j}\big)_{it}$. If $1-\alpha$ is sufficiently small, then an adaptation of the proof of Proposition 3 to this setting shows that $p^{-1}XX^T$ and $p^{-1}X^{(\alpha)}\big(X^{(\alpha)}\big)^T$ have the same limiting spectral distribution almost surely. The next step is to partition $X^{(\alpha)}$ into two blocks of dimensions $p\times\lfloor n^\alpha\rfloor$ and $p\times(n-\lfloor n^\alpha\rfloor)$, respectively. If we denote these two blocks by $X^{(\alpha)}_1$ and $X^{(\alpha)}_2$, i.e. $X^{(\alpha)} = \big[X^{(\alpha)}_1\ X^{(\alpha)}_2\big]$, then clearly $X^{(\alpha)}\big(X^{(\alpha)}\big)^T = X^{(\alpha)}_1\big(X^{(\alpha)}_1\big)^T + X^{(\alpha)}_2\big(X^{(\alpha)}_2\big)^T$, and an application of Bai and Silverstein [5, Theorem A.43] yields that

$$\sup_{\lambda\in\mathbb{R}_{\ge0}}\Big|F^{p^{-1}X^{(\alpha)}(X^{(\alpha)})^T}([0,\lambda]) - F^{p^{-1}X^{(\alpha)}_2(X^{(\alpha)}_2)^T}([0,\lambda])\Big| \le \frac{1}{p}\operatorname{rank}\Big(X^{(\alpha)}_1\big(X^{(\alpha)}_1\big)^T\Big) \le \frac{1}{p}\min\big(\lfloor n^\alpha\rfloor, p\big) = O(p^{\alpha-1}) \to 0.$$

It therefore suffices to derive the limiting spectral distribution of $p^{-1}X^{(\alpha)}_2\big(X^{(\alpha)}_2\big)^T$. Since the matrix $X^{(\alpha)}_2$ has independent rows, this could be done by a careful adaptation of the arguments given in Pfaffel and Schlemm [18]. We chose, however, to provide a self-contained proof, which also provides intermediate results of independent interest like Proposition 5, and we therefore omit the lengthy details of this alternative proof.